NSF PAR Search | NSF Public Access Repository

Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations

Koprulu, Cevahir; Li, Po-Han; Qiu, Tianyu; Zhao, Ruihan; Westenbroek, Tyler; Fridovich-Keil, David; Chinchali, Sandeep; Topcu, Ufuk (June 2025, 7th Annual Learning for Dynamics & Control Conference)

Free, publicly-accessible full text available June 6, 2026

Should We Use Model-Free or Model-Based Control? A Case Study of Battery Control

https://doi.org/10.1109/NAPS61145.2024.10741791

El Hajj Chehade, Mohamad Fares; Cho, Young-Ho; Chinchali, Sandeep; Zhu, Hao (October 2024, IEEE)

Full Text Available

Joint learning of reward machines and policies in environments with partially known semantics

https://doi.org/10.1016/j.artint.2024.104146

Verginis, Christos K; Koprulu, Cevahir; Chinchali, Sandeep; Topcu, Ufuk (August 2024, Artificial Intelligence)

We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties in the environment, called atomic propositions, and represented by Boolean variables. One unrealistic assumption commonly used in the literature is that the truth values of these propositions are accurately known. In practice, however, these truth values are uncertain because they come from imperfect sensors. At the same time, reward machines can be difficult to model explicitly, especially when they encode complicated tasks. We develop a reinforcement-learning algorithm that infers a reward machine encoding the underlying task while learning how to execute it, despite the uncertainty in the propositions’ truth values. To address this uncertainty, the algorithm maintains a probabilistic estimate of the truth values of the atomic propositions; it updates this estimate according to new sensory measurements that arrive from exploration of the environment. Additionally, the algorithm maintains a hypothesis reward machine, which acts as an estimate of the reward machine that encodes the task to be learned. As the agent explores the environment, the algorithm updates the hypothesis reward machine according to the obtained rewards and the estimate of the atomic propositions’ truth values. Finally, the algorithm uses a Q-learning procedure over the states of the hypothesis reward machine to determine an optimal policy that accomplishes the task. We prove that the algorithm successfully infers the reward machine and asymptotically learns a policy that accomplishes the respective task.
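As a rough illustration of two ingredients this abstract mentions, the sketch below shows a Bayesian update of the belief that one atomic proposition is true, and a tabular Q-learning step indexed by both the environment state and the reward-machine state. It is not the authors' algorithm and does not cover inference of the hypothesis reward machine. The symmetric sensor-noise model, the function names `update_belief` and `q_update`, and all state and action labels are assumptions made for this example.

```python
from collections import defaultdict


def update_belief(prior, observation, sensor_accuracy):
    """Bayes update of P(proposition is true) after one noisy binary reading.

    sensor_accuracy is the assumed probability that the sensor reports the
    true value (a hypothetical symmetric noise model).
    """
    like_true = sensor_accuracy if observation else 1.0 - sensor_accuracy
    like_false = 1.0 - sensor_accuracy if observation else sensor_accuracy
    evidence = like_true * prior + like_false * (1.0 - prior)
    return like_true * prior / evidence


def q_update(q, state, rm_state, action, reward, next_state, next_rm_state,
             actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step on the product of environment states and
    (hypothesis) reward-machine states."""
    best_next = max(q[(next_state, next_rm_state, a)] for a in actions)
    key = (state, rm_state, action)
    q[key] += alpha * (reward + gamma * best_next - q[key])
    return q[key]


# Belief about a single proposition after four noisy readings.
belief = 0.5
for reading in [True, True, False, True]:
    belief = update_belief(belief, reading, sensor_accuracy=0.8)

# One Q-learning step; "s0", "u0", etc. are hypothetical labels.
q_table = defaultdict(float)
new_value = q_update(q_table, "s0", "u0", "right", 1.0, "s1", "u1",
                     ["left", "right"])
```

Keying the Q-table on the pair of environment state and reward-machine state is what lets a tabular learner handle rewards that depend on task progress rather than on the current environment state alone.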
Full Text Available

ControlPay: An Adaptive Payment Controller for Blockchain Economies

https://doi.org/10.1109/Blockchain62396.2024.00048

Akcin, Oguzhan; Streit, Robert P; Oommen, Benjamin; Vishwanath, Sriram; Chinchali, Sandeep (August 2024, IEEE)

Full Text Available

Safe Networked Robotics With Probabilistic Verification

https://doi.org/10.1109/LRA.2023.3340525

Narasimhan, Sai Shankar; Bhat, Sharachchandra; Chinchali, Sandeep P. (March 2024, IEEE Robotics and Automation Letters)

Full Text Available

Learning Adaptive Horizon Maps Based on Error Forecast for Model Predictive Control

https://doi.org/10.1109/CDC49753.2023.10384131

Gonzalez, Carlos; Bang, Seung Hyeon; Li, Po-han; Chinchali, Sandeep; Sentis, Luis (December 2023, IEEE)

Full Text Available

Task-aware Distributed Source Coding under Dynamic Bandwidth

Li, Po-han; Ankireddy, Sravan Kumar; Zhao, Ruihan; Mahjoub, Hossein; Pari, Ehsan; Topcu, Ufuk; Chinchali, Sandeep; Kim, Hyeji (February 2024, Advances in Neural Information Processing Systems)

Fleet Active Learning: A Submodular Maximization Approach

Akcin, Oguzhan; Unuvar, Orhan; Ure, Onat; Chinchali, Sandeep (August 2023, Conference on Robot Learning (CoRL))

In multi-robot systems, robots often gather data to improve the performance of their deep neural networks (DNNs) for perception and planning. Ideally, these robots should select the most informative samples from their local data distributions by employing active learning approaches. However, when the data collection is distributed among multiple robots, redundancy becomes an issue as different robots may select similar data points. To overcome this challenge, we propose a fleet active learning (FAL) framework in which robots collectively select informative data samples to enhance their DNN models. Our framework leverages submodular maximization techniques to prioritize the selection of samples with high information gain. Through an iterative algorithm, the robots coordinate their efforts to collectively select the most valuable samples while minimizing communication between robots. We provide a theoretical analysis of the performance of our proposed framework and show that it approximates the NP-hard optimal solution. We demonstrate the effectiveness of our framework through experiments on real-world perception and classification datasets, which include autonomous driving datasets such as Berkeley DeepDrive. Our results show an improvement of up to 25.0% in classification accuracy, 9.2% in mean average precision, and 48.5% in the submodular objective value compared to a completely distributed baseline.
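To make the redundancy problem concrete, here is a minimal sketch of greedy submodular maximization in which robots choose samples one after another, each conditioning on what earlier robots already selected. It is a generic sequential greedy scheme with a simple set-coverage objective, not the FAL algorithm from the paper. The objective, the helper names `coverage` and `greedy_select`, and the toy data are all assumptions made for this example.

```python
def coverage(selected, features):
    """Submodular objective: number of distinct features covered."""
    covered = set()
    for sample in selected:
        covered |= features[sample]
    return len(covered)


def greedy_select(candidates, features, already_selected, budget):
    """Greedily pick up to `budget` samples with the largest marginal gain,
    given the samples the rest of the fleet has already selected."""
    chosen = []
    for _ in range(budget):
        base = coverage(already_selected + chosen, features)
        best, best_gain = None, 0
        for sample in candidates:
            if sample in chosen:
                continue
            gain = coverage(already_selected + chosen + [sample], features) - base
            if gain > best_gain:
                best, best_gain = sample, gain
        if best is None:  # no remaining sample adds new coverage
            break
        chosen.append(best)
    return chosen


# Hypothetical toy data: sample id -> set of feature labels it covers.
features = {
    "a1": {"car", "truck"}, "a2": {"car"},
    "b1": {"car", "truck"}, "b2": {"cyclist", "sign"},
}
robot_data = {"robot_a": ["a1", "a2"], "robot_b": ["b1", "b2"]}

# Robots select in turn; each sees the fleet's picks so far.
fleet_selection = []
for robot, candidates in robot_data.items():
    fleet_selection += greedy_select(candidates, features, fleet_selection, budget=1)
```

In this toy run the second robot skips `b1`, which duplicates what the first robot already covered, and picks `b2` instead. A fully distributed baseline in which each robot ignores the others would have no reason to avoid that duplicate.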
Task-aware Privacy Preservation for Multi-dimensional Data

Cheng, Jiangnan; Tang, Ao; Chinchali, Sandeep (January 2022, Proceedings of Machine Learning Research)

Full Text Available

Data Sharing and Compression for Cooperative Networked Control

Cheng, Jiangnan; Pavone, Marco; Sachin; Chinchali, Sandeep; Tang, Ao (January 2021, Advances in Neural Information Processing Systems)

Full Text Available
